<!doctype html>
<html>
  <head>
    <!-- MathJax -->
    <script type="text/javascript"
      src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
    </script>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="chrome=1">
    <title>
      Caffe | CIFAR-10 tutorial
    </title>

    <link rel="icon" type="image/png" href="/images/caffeine-icon.png">

    <link rel="stylesheet" href="/stylesheets/reset.css">
    <link rel="stylesheet" href="/stylesheets/styles.css">
    <link rel="stylesheet" href="/stylesheets/pygment_trac.css">

    <meta name="viewport" content="width=device-width, initial-scale=1, user-scalable=no">
    <!--[if lt IE 9]>
    <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
    <![endif]-->
  </head>
  <body>
  <script>
    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
    })(window,document,'script','//www.google-analytics.com/analytics.js','ga');

    ga('create', 'UA-46255508-1', 'daggerfs.com');
    ga('send', 'pageview');
  </script>
    <div class="wrapper">
      <header>
        <h1 class="header"><a href="/">Caffe</a></h1>
        <p class="header">
          Deep learning framework by <a class="header name" href="http://bair.berkeley.edu/">BAIR</a>
        </p>
        <p class="header">
          Created by
          <br>
          <a class="header name" href="http://daggerfs.com/">Yangqing Jia</a>
          <br>
          Lead Developer
          <br>
          <a class="header name" href="http://imaginarynumber.net/">Evan Shelhamer</a>
        <ul>
          <li>
            <a class="buttons github" href="https://github.com/BVLC/caffe">View On GitHub</a>
          </li>
        </ul>
      </header>
      <section>

      <h1 id="alexs-cifar-10-tutorial-caffe-style">Alex’s CIFAR-10 tutorial, Caffe style</h1>

<p>Alex Krizhevsky’s <a href="https://code.google.com/p/cuda-convnet/">cuda-convnet</a> details the model definitions, parameters, and training procedure for good performance on CIFAR-10. This example reproduces his results in Caffe.</p>

<p>We will assume that you have Caffe successfully compiled. If not, please refer to the <a href="/installation.html">Installation page</a>. In this tutorial, we will assume that your caffe installation is located at <code class="highlighter-rouge">CAFFE_ROOT</code>.</p>

<p>We thank @chyojn for the pull request that defined the model schemas and solver configurations.</p>

<p><em>This example is a work-in-progress. It would be nice to further explain details of the network and training choices and benchmark the full training.</em></p>

<h2 id="prepare-the-dataset">Prepare the Dataset</h2>

<p>You will first need to download and convert the data format from the <a href="http://www.cs.toronto.edu/~kriz/cifar.html">CIFAR-10 website</a>. To do this, simply run the following commands:</p>

<div class="highlighter-rouge"><pre class="highlight"><code>cd $CAFFE_ROOT
./data/cifar10/get_cifar10.sh
./examples/cifar10/create_cifar10.sh
</code></pre>
</div>

<p>If it complains that <code class="highlighter-rouge">wget</code> or <code class="highlighter-rouge">gunzip</code> are not installed, you need to install them respectively. After running the script there should be the dataset, <code class="highlighter-rouge">./cifar10-leveldb</code>, and the data set image mean <code class="highlighter-rouge">./mean.binaryproto</code>.</p>

<h2 id="the-model">The Model</h2>

<p>The CIFAR-10 model is a CNN that composes layers of convolution, pooling, rectified linear unit (ReLU) nonlinearities, and local contrast normalization with a linear classifier on top of it all. We have defined the model in the <code class="highlighter-rouge">CAFFE_ROOT/examples/cifar10</code> directory’s <code class="highlighter-rouge">cifar10_quick_train_test.prototxt</code>.</p>

<h2 id="training-and-testing-the-quick-model">Training and Testing the “Quick” Model</h2>

<p>Training the model is simple after you have written the network definition protobuf and solver protobuf files (refer to <a href="../examples/mnist.html">MNIST Tutorial</a>). Simply run <code class="highlighter-rouge">train_quick.sh</code>, or the following command directly:</p>

<div class="highlighter-rouge"><pre class="highlight"><code>cd $CAFFE_ROOT
./examples/cifar10/train_quick.sh
</code></pre>
</div>

<p><code class="highlighter-rouge">train_quick.sh</code> is a simple script, so have a look inside. The main tool for training is <code class="highlighter-rouge">caffe</code> with the <code class="highlighter-rouge">train</code> action, and the solver protobuf text file as its argument.</p>

<p>When you run the code, you will see a lot of messages flying by like this:</p>

<div class="highlighter-rouge"><pre class="highlight"><code>I0317 21:52:48.945710 2008298256 net.cpp:74] Creating Layer conv1
I0317 21:52:48.945716 2008298256 net.cpp:84] conv1 &lt;- data
I0317 21:52:48.945725 2008298256 net.cpp:110] conv1 -&gt; conv1
I0317 21:52:49.298691 2008298256 net.cpp:125] Top shape: 100 32 32 32 (3276800)
I0317 21:52:49.298719 2008298256 net.cpp:151] conv1 needs backward computation.
</code></pre>
</div>

<p>These messages tell you the details about each layer, its connections and its output shape, which may be helpful in debugging. After the initialization, the training will start:</p>

<div class="highlighter-rouge"><pre class="highlight"><code>I0317 21:52:49.309370 2008298256 net.cpp:166] Network initialization done.
I0317 21:52:49.309376 2008298256 net.cpp:167] Memory required for Data 23790808
I0317 21:52:49.309422 2008298256 solver.cpp:36] Solver scaffolding done.
I0317 21:52:49.309447 2008298256 solver.cpp:47] Solving CIFAR10_quick_train
</code></pre>
</div>

<p>Based on the solver setting, we will print the training loss function every 100 iterations, and test the network every 500 iterations. You will see messages like this:</p>

<div class="highlighter-rouge"><pre class="highlight"><code>I0317 21:53:12.179772 2008298256 solver.cpp:208] Iteration 100, lr = 0.001
I0317 21:53:12.185698 2008298256 solver.cpp:65] Iteration 100, loss = 1.73643
...
I0317 21:54:41.150030 2008298256 solver.cpp:87] Iteration 500, Testing net
I0317 21:54:47.129461 2008298256 solver.cpp:114] Test score #0: 0.5504
I0317 21:54:47.129500 2008298256 solver.cpp:114] Test score #1: 1.27805
</code></pre>
</div>

<p>For each training iteration, <code class="highlighter-rouge">lr</code> is the learning rate of that iteration, and <code class="highlighter-rouge">loss</code> is the training function. For the output of the testing phase, <strong>score 0 is the accuracy</strong>, and <strong>score 1 is the testing loss function</strong>.</p>

<p>And after making yourself a cup of coffee, you are done!</p>

<div class="highlighter-rouge"><pre class="highlight"><code>I0317 22:12:19.666914 2008298256 solver.cpp:87] Iteration 5000, Testing net
I0317 22:12:25.580330 2008298256 solver.cpp:114] Test score #0: 0.7533
I0317 22:12:25.580379 2008298256 solver.cpp:114] Test score #1: 0.739837
I0317 22:12:25.587262 2008298256 solver.cpp:130] Snapshotting to cifar10_quick_iter_5000
I0317 22:12:25.590215 2008298256 solver.cpp:137] Snapshotting solver state to cifar10_quick_iter_5000.solverstate
I0317 22:12:25.592813 2008298256 solver.cpp:81] Optimization Done.
</code></pre>
</div>

<p>Our model achieved ~75% test accuracy. The model parameters are stored in binary protobuf format in</p>

<div class="highlighter-rouge"><pre class="highlight"><code>cifar10_quick_iter_5000
</code></pre>
</div>

<p>which is ready-to-deploy in CPU or GPU mode! Refer to the <code class="highlighter-rouge">CAFFE_ROOT/examples/cifar10/cifar10_quick.prototxt</code> for the deployment model definition that can be called on new data.</p>

<h2 id="why-train-on-a-gpu">Why train on a GPU?</h2>

<p>CIFAR-10, while still small, has enough data to make GPU training attractive.</p>

<p>To compare CPU vs. GPU training speed, simply change one line in all the <code class="highlighter-rouge">cifar*solver.prototxt</code>:</p>

<div class="highlighter-rouge"><pre class="highlight"><code># solver mode: CPU or GPU
solver_mode: CPU
</code></pre>
</div>

<p>and you will be using CPU for training.</p>


      </section>
  </div>
  </body>
</html>
